I took a look at this, and one easy improvement to the code as it stands is to increase the node capacity to around 9 or 10. That seems to perform much better than 3. Beyond that, the bottleneck moves somewhere else.
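To illustrate why a larger capacity can help, here is a minimal sketch that assumes the structure is a point quadtree. That is my assumption, not something confirmed above, and `Node`, `build`, and the point counts are all hypothetical names and values. It only compares tree shape (node count and depth) for two capacities; it does not measure runtime.

```python
import random


class Node:
    """A square region that holds up to `capacity` points before splitting.

    Hypothetical quadtree node, used only to illustrate the capacity trade-off.
    """

    def __init__(self, x, y, size, capacity):
        self.x, self.y, self.size = x, y, size
        self.capacity = capacity
        self.points = []
        self.children = None  # None while this node is a leaf

    def insert(self, px, py):
        if self.children is None:
            self.points.append((px, py))
            # Split once the leaf overflows; the size guard stops runaway
            # subdivision if many points share the same coordinates.
            if len(self.points) > self.capacity and self.size > 1e-9:
                self._split()
            return
        self._child_for(px, py).insert(px, py)

    def _split(self):
        half = self.size / 2
        # Children are ordered so that index = iy * 2 + ix.
        self.children = [
            Node(self.x + dx * half, self.y + dy * half, half, self.capacity)
            for dy in (0, 1)
            for dx in (0, 1)
        ]
        pending, self.points = self.points, []
        for px, py in pending:
            self._child_for(px, py).insert(px, py)

    def _child_for(self, px, py):
        half = self.size / 2
        ix = 1 if px >= self.x + half else 0
        iy = 1 if py >= self.y + half else 0
        return self.children[iy * 2 + ix]

    def stats(self):
        """Return (node_count, max_depth) for this subtree."""
        if self.children is None:
            return 1, 0
        child_stats = [child.stats() for child in self.children]
        total = 1 + sum(count for count, _ in child_stats)
        depth = 1 + max(d for _, d in child_stats)
        return total, depth


def build(capacity, n_points=2000, seed=0):
    """Insert uniformly random points and report the resulting tree shape."""
    rng = random.Random(seed)
    root = Node(0.0, 0.0, 1.0, capacity)
    for _ in range(n_points):
        root.insert(rng.random(), rng.random())
    return root.stats()


if __name__ == "__main__":
    for capacity in (3, 10):
        nodes, depth = build(capacity)
        print(f"capacity={capacity}: nodes={nodes}, max depth={depth}")
```

With the same points, the higher capacity gives fewer nodes and a tree that is no deeper, so there is less traversal and allocation per operation. The cost is more points to scan per leaf, which may be part of why the gains level off past 9 or 10; the best value will depend on the actual workload.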